Crowdsourcing
Crowdsourced Clustering: Querying Edges vs Triangles
We consider the task of clustering items using answers from non-expert crowd workers. In such cases, the workers are often not able to label the items directly; however, it is reasonable to assume that they can compare items and judge whether they are similar or not. An important question is what queries to make, and we compare two types: random edge queries, where a pair of items is revealed, and random triangle queries, where a triple is. Since it is far too expensive to query all possible edges and/or triangles, we need to work with partial observations subject to a fixed query budget constraint. When a generative model for the data is available (and we consider a few of these), we determine the cost of a query by its entropy; when such models do not exist, we use the average response time per query of the workers as a surrogate for the cost. In addition to theoretical justification, through several simulations and experiments on two real data sets on Amazon Mechanical Turk, we empirically demonstrate that, for a fixed budget, triangle queries uniformly outperform edge queries. Even though, in contrast to edge queries, triangle queries reveal dependent edges, they provide more reliable edges and, for a fixed budget, many more of them. We also provide a sufficient condition on the number of observations, edge densities inside and outside the clusters, and the minimum cluster size required for the exact recovery of the true adjacency matrix via triangle queries using a convex optimization-based clustering algorithm.
- Information Technology > Communications > Social Media > Crowdsourcing (0.39)
- Information Technology > Artificial Intelligence > Machine Learning (0.39)
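The difference between the two query types in the abstract above can be illustrated with a small simulation. This is a minimal sketch, not the authors' model: the function names, the independent flip probability `flip_prob` (assumed to be below 0.5), and the rule of resampling triangle answers that violate transitivity are all assumptions made for illustration.

```python
import random
from itertools import combinations


def edge_query(labels, i, j, flip_prob, rng):
    """Noisy answer to 'are items i and j similar?' (1 = same cluster)."""
    truth = int(labels[i] == labels[j])
    # Assumed noise model: the worker flips the true answer with prob. flip_prob.
    return truth ^ int(rng.random() < flip_prob)


def triangle_query(labels, triple, flip_prob, rng):
    """Noisy answers for all three pairs of a triple.

    Assumption: an answer with exactly two 'similar' pairs violates
    transitivity, so it is rejected and resampled. The three returned
    edges are therefore dependent, unlike three separate edge queries.
    """
    pairs = list(combinations(triple, 2))
    while True:
        answers = {p: edge_query(labels, p[0], p[1], flip_prob, rng) for p in pairs}
        if sum(answers.values()) != 2:
            return answers


def collect(labels, n_queries, kind, flip_prob=0.1, seed=0):
    """Issue n_queries random queries and return the observed edges."""
    rng = random.Random(seed)
    items = range(len(labels))
    observed = {}
    for _ in range(n_queries):
        if kind == "edge":
            i, j = sorted(rng.sample(items, 2))
            observed[(i, j)] = edge_query(labels, i, j, flip_prob, rng)
        else:
            triple = tuple(sorted(rng.sample(items, 3)))
            observed.update(triangle_query(labels, triple, flip_prob, rng))
    return observed
```

Calling `collect(labels, 100, "edge")` and `collect(labels, 100, "triangle")` on the same labels shows how many distinct edges each query type reveals for the same number of queries. The sketch counts queries only; the abstract measures cost by query entropy or worker response time, which is not modeled here.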
Semi-crowdsourced Clustering with Deep Generative Models
We consider the semi-supervised clustering problem where crowdsourcing provides noisy information about the pairwise comparisons on a small subset of data, i.e., whether a sample pair is in the same cluster. We propose a new approach that includes a deep generative model (DGM) to characterize low-level features of the data, and a statistical relational model for noisy pairwise annotations on its subset. The two parts share the latent variables. To make the model automatically trade off between its complexity and its fit to the data, we also develop its fully Bayesian variant. The challenge of inference is addressed by fast (natural-gradient) stochastic variational inference algorithms, where we effectively combine variational message passing for the relational part and amortized learning of the DGM under a unified framework. Empirical results on synthetic and real-world datasets show that our model outperforms previous crowdsourced clustering methods.
- Information Technology > Communications > Social Media > Crowdsourcing (0.65)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.61)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.32)
Google scraps AI search feature that crowdsourced amateur medical advice
Google had said the 'What People Suggest' feature aimed to provide users with information from people with similar lived experiences. Google has dropped a new artificial intelligence search feature that gave users crowdsourced health advice from amateurs around the world. The company had said its launch of "What People Suggest", which provided tips from strangers, showed "the potential of AI to transform health outcomes across the globe". But Google has since quietly removed the feature, according to three people familiar with the decision.
- Europe > Ukraine (0.06)
- Oceania > Australia (0.05)
- North America > United States > New York (0.05)
- Europe > Switzerland (0.05)
- Leisure & Entertainment > Sports (0.72)
- Health & Medicine > Consumer Health (0.71)
- Government > Regional Government (0.51)
- Information Technology > Information Management > Search (1.00)
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Communications > Social Media > Crowdsourcing (0.62)
'Pokémon Go' players have been unknowingly training delivery robots
The massive crowdsourcing effort could use real-world to help robots deliver pizza. Nearly a decade ago, Pokémon Go turned the real world into a digital scavenger hunt, with virtual creatures hiding in plain sight. The early augmented reality smartphone app prompted hundreds of millions of players to wander into parks, parking lots, and even dimly lit alleyways, peering through their phone cameras in search of Pikachus and Charizards that the app superimposed onto their surroundings.
- North America > United States > District of Columbia > Washington (0.25)
- North America > United States > New York (0.05)
- Asia > China (0.05)
- Information Technology > Communications > Social Media > Crowdsourcing (0.71)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.63)
- North America > United States > New York (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)
- Information Technology > Communications > Social Media > Crowdsourcing (0.82)
- North America > Canada > Quebec > Montreal (0.16)
- Africa > Kenya (0.08)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Communications > Social Media > Crowdsourcing (0.41)
Label Poisoning is All You Need
In a backdoor attack, an adversary injects corrupted data into a model's training dataset in order to gain control over its predictions on images with a specific attacker-defined trigger. A typical corrupted training example requires altering both the image, by applying the trigger, and the label. Models trained on clean images, therefore, were considered safe from backdoor attacks. However, in some common machine learning scenarios, the training labels are provided by potentially malicious third parties. These include crowd-sourced annotation and knowledge distillation. We, hence, investigate a fundamental question: can we launch a successful backdoor attack by only corrupting labels?
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Workflow (0.68)
- Research Report > New Finding (0.46)
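The distinction drawn in the "Label Poisoning is All You Need" abstract, between a typical corrupted example (image and label both altered) and a label-only one, can be made concrete as follows. This is a minimal sketch under stated assumptions: images are plain nested lists, the trigger is a hypothetical square patch in the bottom-right corner, and all function names are invented for illustration. It does not show how an attacker would choose which labels to corrupt, which is the question the paper investigates.

```python
def apply_trigger(image, value=1.0, size=2):
    """Return a copy of a 2-D image with a square patch stamped in
    the bottom-right corner (a hypothetical trigger pattern)."""
    patched = [row[:] for row in image]
    for r in range(-size, 0):
        for c in range(-size, 0):
            patched[r][c] = value
    return patched


def classic_poison(image, label, target_label):
    """Typical backdoor example: both the image and the label change."""
    return apply_trigger(image), target_label


def label_only_poison(image, label, target_label):
    """Label-only corruption: the image is left clean, only the label changes."""
    return image, target_label
```

In the label-only setting the training images stay clean, so the attacker's only lever is the choice of which examples receive the target label.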
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > Canada (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
- Information Technology > Communications > Social Media > Crowdsourcing (0.87)
- Information Technology > Data Science (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Checklist 1. For all authors (a)
Do the main claims made in the abstract and introduction accurately reflect the paper's
If you ran experiments (e.g. for benchmarks)...
(a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See A.2
(b) Did you specify all the training details (e.g., data splits, hyperparameters, how they
Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)?
Did you include the total amount of compute and the type of resources used (e.g., type
Did you include any new assets either in the supplemental material or as a URL? [Yes]
Did you discuss whether and how consent was obtained from people whose data you're
If you used crowdsourcing or conducted research with human subjects... (a)
For a detailed description and intended uses, please refer to 1.
A.2 Dataset Accessibility
We plan to host and maintain this dataset on HuggingFace.
A.4 Dataset Examples
Example question-answer pairs are provided in Tables 9, 10, and 11.
Example
Question "What does the symbol mean in Equation 1?" Answer "The symbol in Equation 1 represents "follows this distribution".
"Can you provide more information about what is meant by 'generative process in
"The generative process refers to Eq. (2), which is a conceptual equation representing
Question "How does the DeepMoD method differ from what is written in/after Eq 3?" Answer "We add noise only to
Question "How to do the adaptive attack based on Eq.(16)?
"By Maximizing the loss in Eq (16) using an iterative method such as PGD on the end-to-end model we attempt to maximize the loss to cause misclassification while
Question "How does the proposed method handle the imputed reward?"
"The proposed method uses the imputed reward in the second part of Equation 1,
"Table 2 is used to provide a comparison of the computational complexity of the
"Optimal number of clusters affected by the number of classes or similarity between
"The authors have addressed this concern by including a new experiment in Table 4 of
Question "Can you clarify the values represented in Table 1?" Answer "The values in Table 1 represent the number of evasions, which shows the attack
"The experiments in table 1 do not seem to favor the proposed method much; softmax
Can the authors explain why this might be the case?" Answer "The proposed method reduces to empirical risk minimization with a proper loss, and
However, the authors hope that addressing concerns about the method's theoretical
Question "Does the first row of Table 2 correspond to the offline method?"
- Oceania > New Zealand (0.04)
- Oceania > Australia (0.04)
- North America > United States (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.54)
- Information Technology > Communications > Social Media > Crowdsourcing (0.34)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > Florida > Alachua County > Gainesville (0.14)
- North America > United States > Oregon > Benton County > Corvallis (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
- Information Technology > Communications > Social Media > Crowdsourcing (0.67)